MLOps - Implementation of Deep Learning Projects and Deployment on AWS (Git and DVC pipeline)

Vaishnavi Shankar Devadig, 2024

Background¶

Imagine constructing sophisticated machine learning models but neglecting their operational integration. MLOps bridges this gap, establishing a vital framework for seamless integration and ensuring continuous real-world impact. Encompassing aspects like deployment, monitoring, and performance adjustment, MLOps provides the tools and processes to optimize the AI ecosystem for sustained value generation. We can think of it as the invisible but crucial foundation upon which our AI castle thrives!

Git for MLOps¶

The foundation of robust MLOps lies in version control, and Git serves as the cornerstone. Just like architects meticulously track changes in building plans, Git allows us to track alterations to the code, data, and configurations. Each model iteration, code tweak, and data update gets meticulously recorded, providing a clear history and enabling easy rollbacks if needed. Accidentally introduce a bug, and with Git, reverting to a previous, working version is seamless.

Additionally, Git facilitates collaboration, allowing different teams to work on projects simultaneously while maintaining order and preventing conflicts. It's like having a shared blueprint, ensuring everyone contributes effectively and the project evolves on the right track.

DVC Pipeline¶

MLOps often involves complex workflows, from data preprocessing to model deployment. Here's where DVC pipelines step in, acting like the conductor of the AI orchestra. By defining these reusable workflows, we ensure consistency and reproducibility in the model development process. Imagine building the model training pipeline once, specifying how data gets downloaded, preprocessed, and fed to the model. With DVC, we can reuse this pipeline on different datasets or model versions, saving time and guaranteeing consistency. Moreover, DVC integrates seamlessly with Git, versioning our pipelines alongside the code and data. This creates a transparent and reliable record of how the models were built, enabling easier troubleshooting and collaboration. DVC thus orchestrates your complex AI processes, ensuring efficiency and repeatability throughout the MLOps lifecycle.

Implementation¶

I used the skin cancer classification dataset from Kaggle to build a simple deep learning project. The main aim of the project is to demonstrate MLOps capabilities using Git and DVC, so the deep learning model itself is deliberately basic.

GitHub Repository¶

https://github.com/vdevadig/Exe-Project

The repo contains the entire hierarchy used to implement this project.

Pipeline Stages¶

  • Data Ingestion
  • Preparing the base model
  • Training (Also includes the callback stage.)
  • Evaluation
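The four stages above map naturally onto a `dvc.yaml` pipeline definition. The sketch below is illustrative only: the script names, artifact paths, and params keys are hypothetical placeholders, not the repo's exact layout.

```yaml
# Illustrative dvc.yaml -- stage names mirror the list above;
# file paths and params keys are assumptions.
stages:
  data_ingestion:
    cmd: python stage_01_data_ingestion.py
    deps:
      - stage_01_data_ingestion.py
    outs:
      - artifacts/data_ingestion
  prepare_base_model:
    cmd: python stage_02_prepare_base_model.py
    params:
      - INCLUDE_TOP
      - WEIGHTS
    deps:
      - stage_02_prepare_base_model.py
    outs:
      - artifacts/prepare_base_model
  training:
    cmd: python stage_03_training.py
    params:
      - AUGMENTATION
      - BATCH_SIZE
      - EPOCHS
      - LEARNING_RATE
    deps:
      - stage_03_training.py
      - artifacts/data_ingestion
      - artifacts/prepare_base_model
    outs:
      - artifacts/training/model.h5
  evaluation:
    cmd: python stage_04_evaluation.py
    deps:
      - stage_04_evaluation.py
      - artifacts/training/model.h5
    metrics:
      - scores.json:
          cache: false
```

With a file like this in place, `dvc repro` re-runs only the stages whose dependencies changed, which is what gives the pipeline its reproducibility guarantee.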

Data Ingestion¶

In the data ingestion phase of the MLOps pipeline for the skin cancer classification project, data is sourced from Kaggle. This stage involves the acquisition and initial handling of the dataset, preparing it for further processing and analysis. Given the dataset's direct application to skin cancer classification, it's crucial to ensure the data's relevance and quality; for this demonstration project, however, a dedicated data-validation stage is omitted. The dataset is then preprocessed, where data is formatted and prepared to meet the specific requirements of the model architecture.
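A minimal sketch of the local half of this stage, assuming the Kaggle archive has already been downloaded (e.g. via the `kaggle` CLI); the function name and extension filter are my own, not the repo's:

```python
import zipfile
from pathlib import Path

def ingest(zip_path: str, out_dir: str) -> list[Path]:
    """Extract the downloaded Kaggle archive and list the image files.

    Downloading itself is assumed to have happened already; this sketch
    only covers the local extraction/handling step of data ingestion.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out)
    # Keep only common image formats; any other files are ignored.
    exts = {".jpg", ".jpeg", ".png"}
    return sorted(p for p in out.rglob("*") if p.suffix.lower() in exts)
```

Returning the file list makes it easy for the next stage to verify that a non-empty dataset actually arrived before preprocessing starts.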

Preparing the Base Model¶

The project leverages the VGG-16 architecture as the base model, chosen for its effectiveness in image recognition tasks. VGG-16, with its deep convolutional network, is particularly adept at capturing complex patterns in image data. In preparing the base model, INCLUDE_TOP is set to False to customize the network's top layers for our specific binary classification task. The model is initialized with ImageNet weights (WEIGHTS: imagenet) to leverage pre-learned patterns.
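In Keras terms, this stage looks roughly like the sketch below. The `INCLUDE_TOP: False` and `WEIGHTS: imagenet` settings come from the text above; the frozen-base choice and the `Flatten` + `Dense` softmax head are plausible assumptions for the binary task, not the repo's exact code.

```python
# Hedged sketch of the prepare-base-model stage, assuming TensorFlow/Keras.
import tensorflow as tf

def prepare_base_model(weights="imagenet",
                       input_shape=(224, 224, 3),
                       classes=2):
    # include_top=False drops VGG-16's original 1000-class ImageNet head
    # so a custom classifier can be attached for skin cancer vs. not.
    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape
    )
    base.trainable = False  # keep the pre-learned convolutional filters frozen
    x = tf.keras.layers.Flatten()(base.output)
    out = tf.keras.layers.Dense(classes, activation="softmax")(x)
    return tf.keras.Model(inputs=base.input, outputs=out)
```

Passing `weights=None` builds the same architecture with random weights, which is handy for quick shape checks without downloading the ImageNet weights.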

Training¶

The training phase is a critical component of the pipeline, where the model learns to classify skin cancer from images. This stage incorporates data augmentation (AUGMENTATION: True) to introduce variability into the training data, enhancing the model's generalization capabilities. The model processes images resized to [224, 224, 3] with a batch size of 16, iterating over the dataset for a total of 5 epochs. The choice of a relatively high learning rate of 0.01 aims to expedite the convergence process. Training also includes a callback stage for monitoring the model's performance.
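The hyperparameters named above can be wired up roughly as follows; the specific augmentation transforms (rotation, flip, zoom) and the choice of SGD are my assumptions, while the image size, batch size, epochs, and learning rate come from the text.

```python
# Hedged sketch of the training configuration, assuming TensorFlow/Keras.
import tensorflow as tf

IMAGE_SIZE = (224, 224)   # images resized to [224, 224, 3]
BATCH_SIZE = 16
EPOCHS = 5
LEARNING_RATE = 0.01

def make_train_generator(data_dir, augmentation=True):
    # AUGMENTATION: True adds random rotations/flips/zooms, so the model
    # sees slightly different variants of each image every epoch.
    if augmentation:
        datagen = tf.keras.preprocessing.image.ImageDataGenerator(
            rescale=1.0 / 255, rotation_range=20,
            horizontal_flip=True, zoom_range=0.2)
    else:
        datagen = tf.keras.preprocessing.image.ImageDataGenerator(
            rescale=1.0 / 255)
    return datagen.flow_from_directory(
        data_dir, target_size=IMAGE_SIZE, batch_size=BATCH_SIZE)

def compile_for_training(model):
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=LEARNING_RATE),
        loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

Training then becomes a single `model.fit(make_train_generator(...), epochs=EPOCHS, callbacks=[...])` call, with the callbacks covering the monitoring mentioned above.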

Evaluation¶

Upon completing the training, the model undergoes evaluation to assess its performance in classifying skin cancer. The evaluation metrics include loss and accuracy, providing insights into the model's predictive capabilities and the degree to which it has successfully learned from the training data.
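Because DVC can track a small JSON metrics file across pipeline runs, a common pattern is to write the loss and accuracy returned by `model.evaluate(...)` to a `scores.json`. The helper below is a minimal sketch of that last step; the file name and key names are conventional choices, not necessarily the repo's.

```python
import json
from pathlib import Path

def save_scores(loss: float, accuracy: float, path: str = "scores.json") -> dict:
    """Persist evaluation metrics so DVC can version and diff them."""
    scores = {"loss": loss, "accuracy": accuracy}
    Path(path).write_text(json.dumps(scores, indent=4))
    return scores
```

With the file declared under a `metrics:` entry in `dvc.yaml`, `dvc metrics show` and `dvc metrics diff` can then compare accuracy across experiments.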

Post-Setup of the DVC Pipeline¶

  • Set up an IAM user on AWS meant specifically for pipeline and deployment purposes only.

  • Created an Elastic Container Registry Repository to store Docker images for the project.

  • Created an Ubuntu EC2 machine with t2.xlarge configuration on AWS and installed Docker on it.

    image.png

  • Configured the EC2 machine as a self-hosted runner on GitHub and declared the GitHub secrets: the access key, secret access key, region, ECR URI, and ECR repository name.

  • Once done, pushed the final changes to the GitHub repository, and the self-hosted runner started its job.

    Screenshot 2024-02-18 at 12.38.13 PM.png

    Screenshot 2024-02-18 at 12.38.35 PM.png (That PW is Please Work, by the way, because this was third attempt to get my runner working :| )

    image.png

    image.png

    image.png
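The CI/CD flow in the steps above can be sketched as a GitHub Actions workflow file. This is an illustrative outline only: the secret names, the container port, and the exact step layout are assumptions, not the repo's actual workflow.

```yaml
# Illustrative .github/workflows/main.yaml -- secret names and port
# are hypothetical.
name: CI/CD
on:
  push:
    branches: [main]
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build the Docker image and push it to ECR
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          AWS_REGION: ${{ secrets.AWS_REGION }}
        run: |
          aws ecr get-login-password --region $AWS_REGION \
            | docker login --username AWS --password-stdin ${{ secrets.AWS_ECR_LOGIN_URI }}
          docker build -t ${{ secrets.ECR_REPOSITORY_NAME }} .
          docker tag ${{ secrets.ECR_REPOSITORY_NAME }}:latest ${{ secrets.AWS_ECR_LOGIN_URI }}/${{ secrets.ECR_REPOSITORY_NAME }}:latest
          docker push ${{ secrets.AWS_ECR_LOGIN_URI }}/${{ secrets.ECR_REPOSITORY_NAME }}:latest
  deploy:
    runs-on: self-hosted   # the EC2 machine configured above
    needs: build-and-push
    steps:
      - name: Pull and run the latest image
        run: |
          docker pull ${{ secrets.AWS_ECR_LOGIN_URI }}/${{ secrets.ECR_REPOSITORY_NAME }}:latest
          docker run -d -p 8080:8080 ${{ secrets.AWS_ECR_LOGIN_URI }}/${{ secrets.ECR_REPOSITORY_NAME }}:latest
```

Splitting build and deploy into two jobs keeps the heavy Docker build on GitHub-hosted infrastructure while the self-hosted EC2 runner only pulls and serves the image.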

Deployed Project¶

Below are the images of the deployed Deep Learning Project:

Correct Predictions.

Screenshot 2024-02-18 at 1.06.35 PM.png

Screenshot 2024-02-18 at 1.06.58 PM.png